Combining policies in a prioritized, ordered manner is desirable, as it allows for modular design and facilitates data reuse through knowledge transfer. In control theory, prioritized composition is realized via null-space control, where low-priority control actions are projected into the null space of higher-priority control actions. Such an approach is currently unavailable for reinforcement learning. We propose a novel, task-prioritized composition framework for reinforcement learning, which involves a novel concept: the indifference space of reinforcement learning policies. Our framework has the potential to facilitate knowledge transfer and modular design while greatly increasing data efficiency and data reuse for reinforcement learning agents. Moreover, our approach can ensure the satisfaction of high-priority constraints, which makes it promising for learning in safety-critical domains such as robotics. Unlike null-space control, our approach allows learning globally optimal policies for the composite task by online learning in the indifference spaces of the higher-level policies after the initial composite policy has been constructed.
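For context, the null-space composition from control theory referenced above is commonly written as follows; this is the standard textbook formulation, not notation taken from this paper:

```latex
% Two-level null-space composition (standard notation):
% J_1 is the Jacobian of the high-priority task, \dot{x}_1 its desired
% task-space velocity, u_2 the lower-priority action, and J_1^{+} the
% Moore-Penrose pseudoinverse. The projector (I - J_1^{+} J_1) maps u_2
% into the null space of the high-priority task, so u_2 cannot disturb it.
u = J_1^{+}\,\dot{x}_1 + \left(I - J_1^{+} J_1\right) u_2
```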
Stack-of-Tasks (SOT) control allows robots to simultaneously achieve several prioritized objectives formulated in terms of (in)equality constraints in error space. Since this approach solves a sequence of quadratic programs (QPs) at each time step without taking into account any temporal evolution of the state, it is suitable for dealing with local disturbances. However, its limitations lie in handling situations that require non-quadratic objectives to achieve a specific goal, and in coping with control disturbances that require locally suboptimal actions. Recent works address this shortcoming by exploiting finite state machines (FSMs) to compose the tasks in such a way that the robot does not get stuck in local minima. Nevertheless, the intrinsic trade-off between reactivity and modularity that characterizes FSMs makes it impractical to define reactive behaviors in dynamic environments with them. In this letter, we combine the SOT control strategy with behavior trees (BTs), a task-switching structure that addresses some of the limitations of FSMs in terms of reactivity, modularity, and reusability. Experimental results on a Franka Emika Panda 7-DOF manipulator show the robustness of our framework, which lets the robot benefit from the reactivity of both SOT and BTs.
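To illustrate the BT side of this combination, here is a minimal sketch of a fallback (selector) node in Python; this shows generic BT semantics with hypothetical child interfaces, not the authors' implementation:

```python
from enum import Enum

class Status(Enum):
    SUCCESS = 1
    FAILURE = 2
    RUNNING = 3

class Fallback:
    """Generic BT fallback (selector) node: ticks children left to right
    every control cycle and returns the first non-FAILURE status. Because
    the whole tree is re-evaluated each tick, the robot can reactively
    switch tasks as the environment changes, without FSM-style explicit
    transitions between every pair of states."""

    def __init__(self, children):
        self.children = children  # children expose a tick() -> Status method

    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != Status.FAILURE:
                return status
        return Status.FAILURE
```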
Foundation models can be disruptive for future AI development by scaling up deep learning in terms of model size and the breadth and size of training data. These models achieve state-of-the-art performance (often through further adaptation) on a variety of tasks in domains such as natural language processing and computer vision. Foundation models exhibit a novel emergent behavior: in-context learning enables users to provide a query and a few examples from which a model derives an answer without being trained on such queries. Additionally, homogenization of models might replace a myriad of task-specific models with fewer very large models controlled by a few corporations, leading to a shift in power and control over AI. This paper provides a short introduction to foundation models. It contributes by crafting a crisp distinction between foundation models and prior deep learning models, providing a history of machine learning leading to foundation models, elaborating on socio-technical aspects, i.e., organizational issues and end-user interaction, and discussing future research.
A default assumption in reinforcement learning and optimal control is that experience arrives at discrete time points on a fixed clock cycle. Many applications, however, involve continuous systems where the time discretization is not fixed but instead can be managed by a learning algorithm. By analyzing Monte-Carlo value estimation for LQR systems in both finite-horizon and infinite-horizon settings, we uncover a fundamental trade-off between approximation and statistical error in value estimation. Importantly, these two errors behave differently with respect to time discretization, which implies that there is an optimal choice for the temporal resolution that depends on the data budget. These findings show how adapting the temporal resolution can provably improve value estimation quality in LQR systems from finite data. Empirically, we demonstrate the trade-off in numerical simulations of LQR instances and several non-linear environments.
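A minimal sketch of this trade-off on a scalar toy system, assuming a fixed budget of simulated transitions; the system, cost, and parameter values are illustrative, not the paper's setup:

```python
import numpy as np

def mc_value_estimate(h, budget, T=1.0, a=-1.0, q=1.0, sigma=0.5, x0=1.0, seed=0):
    """Monte-Carlo estimate of the finite-horizon cost integral of q*x(t)^2
    over [0, T] for dx = a*x dt + sigma dW, discretized with Euler-Maruyama
    at step h. The fixed budget of transitions is split across rollouts, so
    a finer h leaves fewer rollouts (more statistical error) while a coarser
    h adds discretization bias (more approximation error)."""
    rng = np.random.default_rng(seed)
    n_steps = round(T / h)
    n_rollouts = max(budget // n_steps, 1)
    costs = []
    for _ in range(n_rollouts):
        x, cost = x0, 0.0
        for _ in range(n_steps):
            cost += q * x**2 * h  # Riemann-sum approximation of the cost
            x += a * x * h + sigma * np.sqrt(h) * rng.standard_normal()
        costs.append(cost)
    return float(np.mean(costs))

# The best temporal resolution depends on the data budget, not on h -> 0:
for h in (0.2, 0.1, 0.05, 0.01):
    print(h, mc_value_estimate(h, budget=2000))
```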
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as the bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Quantifying the perceptual similarity of two images is a long-standing problem in low-level computer vision. The natural image domain commonly relies on supervised learning, e.g., a pre-trained VGG, to obtain a latent representation. However, due to domain shift, pre-trained models from the natural image domain might not apply to other image domains, such as medical imaging. Notably, in medical imaging, evaluating perceptual similarity is performed exclusively by specialists trained extensively in diverse medical fields. Thus, medical imaging remains devoid of task-specific, objective perceptual measures. This work answers the question: Is it necessary to rely on supervised learning to obtain an effective representation that could measure perceptual similarity, or is self-supervision sufficient? To understand whether recent contrastive self-supervised representation (CSR) may come to the rescue, we start with natural images and systematically evaluate CSR as a metric across numerous contemporary architectures and tasks and compare it with existing methods. We find that in the natural image domain, CSR performs on par with its supervised counterpart on several perceptual tests as a metric, and in the medical domain, CSR better quantifies perceptual similarity with respect to the experts' ratings. We also demonstrate that CSR can significantly improve image quality in two image synthesis tasks. Finally, our extensive results suggest that perceptuality is an emergent property of CSR, which can be adapted to many image domains without requiring annotations.
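One common way to turn such a representation into a metric is a simple embedding-space distance; the sketch below assumes a contrastively pretrained encoder and cosine distance, which may differ from the exact measure evaluated in the paper:

```python
import torch
import torch.nn.functional as F

def perceptual_distance(encoder: torch.nn.Module,
                        img_a: torch.Tensor,
                        img_b: torch.Tensor) -> float:
    """Embed both images with a (self-supervised) encoder and use cosine
    distance in the embedding space as a perceptual dissimilarity score:
    0 means identical embeddings, larger values mean less similar."""
    with torch.no_grad():
        za = encoder(img_a.unsqueeze(0)).flatten(start_dim=1)
        zb = encoder(img_b.unsqueeze(0)).flatten(start_dim=1)
    return 1.0 - F.cosine_similarity(za, zb).item()
```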
Uncertainty is prevalent in engineering design, statistical learning, and decision making broadly. Due to inherent risk-averseness and ambiguity about assumptions, it is common to address uncertainty by formulating and solving conservative optimization models expressed using measures of risk and related concepts. We survey the rapid development of risk measures over the last quarter century. From their beginning in financial engineering, we recount their spread to nearly all areas of engineering and applied mathematics. Solidly rooted in convex analysis, risk measures furnish a general framework for handling uncertainty with significant computational and theoretical advantages. We describe the key facts, list several concrete algorithms, and provide an extensive list of references for further reading. The survey recalls connections with utility theory and distributionally robust optimization, points to emerging application areas such as fair machine learning, and defines measures of reliability.
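As one concrete example of the convex-analytic machinery such surveys cover, the conditional value-at-risk (superquantile) of a loss X at level α admits the well-known Rockafellar-Uryasev minimization formula:

```latex
% Conditional value-at-risk (superquantile) of a loss X at level alpha,
% in Rockafellar-Uryasev form; (.)_+ denotes max(., 0). The objective is
% convex in t, which is what makes CVaR computationally attractive.
\mathrm{CVaR}_\alpha(X) \;=\; \min_{t \in \mathbb{R}} \Big\{\, t + \tfrac{1}{1-\alpha}\,\mathbb{E}\big[(X - t)_+\big] \Big\}
```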
Deep reinforcement learning (DRL) is a promising approach for learning robot control policies solely from demonstrations and experience. To cover the whole dynamic behavior of the robot, DRL training is an active exploration process that is typically performed in a simulation environment. Although such simulation training is cheap and fast, applying DRL algorithms to real-world settings is difficult. Even if agents are trained until they perform safely in simulation, transferring them to physical systems is hard because of the sim-to-real gap caused by the differences between the simulation dynamics and the physical robot. In this paper, we propose a method for training DRL agents online to drive autonomously on a physical vehicle, using a model-based safety supervisor. Our solution uses a supervisory system that checks whether the action selected by the agent is safe or unsafe and ensures that a safe action is always executed on the vehicle. In this way, we bypass the sim-to-real problem while training DRL algorithms safely, quickly, and efficiently. We present various real-world experiments in which a small physical vehicle is trained online to drive autonomously without prior simulation training. The evaluation results show that our method trains agents with improved sample efficiency while never crashing, and that the trained agents exhibit better driving performance than agents trained in simulation.
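A minimal sketch of the supervisor pattern described here, with hypothetical interfaces; the paper's model-based safety check is abstracted into an `is_safe` predicate:

```python
class SafetySupervisor:
    """Wraps the learning agent's action selection: unsafe proposals are
    replaced by an action from an always-safe fallback controller before
    anything is executed on the physical vehicle."""

    def __init__(self, is_safe, safe_fallback):
        self.is_safe = is_safe              # model-based check: (state, action) -> bool
        self.safe_fallback = safe_fallback  # e.g., a braking or conservative controller

    def filter(self, state, proposed_action):
        if self.is_safe(state, proposed_action):
            return proposed_action
        return self.safe_fallback(state)

# In the online training loop, every action passes through the filter:
#   executed = supervisor.filter(obs, agent.act(obs))
# so exploration continues while crashes are prevented.
```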
The availability of property data is one of the major bottlenecks in chemical process development, often requiring time-consuming and expensive experiments or restricting the design space to a small number of known molecules. This bottleneck has been a continuing motivation for the development of predictive property models. For the prediction of properties of novel molecules, group contribution methods have been pioneering. Recently, machine learning has joined the more established property prediction models. However, even with recent successes, integrating physical constraints into machine learning models remains challenging. Physical constraints, such as the Gibbs-Duhem relation, are vital for many thermodynamic properties and introduce an additional layer of complexity into the prediction. Here, we introduce SPT-NRTL, a machine learning model to predict thermodynamically consistent activity coefficients and to provide NRTL parameters for easy use in process simulations. The results show that SPT-NRTL achieves higher accuracy than UNIFAC in the prediction of activity coefficients across all functional groups and is able to predict many vapor-liquid equilibria with near-experimental accuracy, as illustrated for exemplary mixtures including n-hexane. To ease the application of SPT-NRTL, NRTL parameters of 100 000 000 mixtures were calculated with SPT-NRTL and are provided online.
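The consistency constraint in question is the Gibbs-Duhem relation, which for a binary mixture at constant temperature and pressure requires the activity coefficients to satisfy:

```latex
% Gibbs-Duhem relation for the activity coefficients gamma_i of a binary
% mixture at constant temperature and pressure; x_i are mole fractions.
% A thermodynamically consistent model must satisfy this across compositions.
x_1 \, \mathrm{d}\ln\gamma_1 + x_2 \, \mathrm{d}\ln\gamma_2 = 0
```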
The term NeuralODE describes the structural combination of an artificial neural network (ANN) and a numerical solver for ordinary differential equations (ODEs), with the ANN serving as the right-hand side of the ODE to be solved. This concept is further extended by black-box models in the form of Functional Mock-up Units (FMUs), resulting in a subclass of NeuralODEs named NeuralFMUs. The resulting structure combines the advantages of first-principles and data-driven modeling approaches within a single simulation model: higher prediction accuracy compared to conventional first-principles models (FPMs), and lower training effort compared to purely data-driven models. We present an intuitive workflow to set up and use NeuralFMUs, which enables the encapsulation and reuse of existing conventional models exported from common modeling tools. Moreover, we exemplify this concept by deploying a NeuralFMU for a vehicle longitudinal dynamics model (VLDM), a typical use case in the automotive industry. Relevant challenges that are often neglected in scientific use cases, such as real measurements (including noise), unknown system states, or high-frequency discontinuities, are handled in this contribution. To build a hybrid model with higher prediction quality than the original FPM, we briefly highlight two open-source libraries: FMI.jl, which integrates FMUs into the Julia programming environment, and an extension of this library named FMIFlux.jl, which allows the integration of FMUs into a neural network topology to finally obtain a NeuralFMU.
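To make the hybrid structure concrete, here is a language-agnostic sketch in plain NumPy; the actual work uses FMI.jl and FMIFlux.jl in Julia, `fpm_rhs` merely stands in for the FMU, and all names are illustrative:

```python
import numpy as np

def ann_correction(x, params):
    """Tiny MLP correction term added to the first-principles right-hand side."""
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ x + b1) + b2

def hybrid_rhs(x, t, fpm_rhs, params):
    """NeuralFMU-style hybrid ODE right-hand side: black-box first-principles
    model (the FMU stand-in) plus a learned data-driven correction."""
    return fpm_rhs(x, t) + ann_correction(x, params)

def simulate(x0, t_grid, fpm_rhs, params):
    """Explicit-Euler roll-out of the hybrid ODE; a real NeuralFMU would use
    an adaptive solver and differentiate through it to train the ANN."""
    xs = [np.asarray(x0, dtype=float)]
    for t0, t1 in zip(t_grid[:-1], t_grid[1:]):
        xs.append(xs[-1] + (t1 - t0) * hybrid_rhs(xs[-1], t0, fpm_rhs, params))
    return np.stack(xs)
```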